Devnagari document segmentation using histogram approach

نویسندگان

  • Vikas J. Dongre
  • Vijay H. Mankar
چکیده

Document segmentation is one of the critical phases in machine recognition of any language. Correct segmentation of individual symbols decides the accuracy of character recognition technique. It is used to decompose image of a sequence of characters into sub images of individual symbols by segmenting lines and words. Devnagari is the most popular script in India. It is used for writing Hindi, Marathi, Sanskrit and Nepali languages. Moreover, Hindi is the third most popular language in the world. Devnagari documents consist of vowels, consonants and various modifiers. Hence proper segmentation of Devnagari word is challenging. A simple histogram based approach to segment Devnagari documents is proposed in this paper. Various challenges in segmentation of Devnagari script are also discussed.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Multiple Classifier Combination for Off-line Handwritten Devnagari Character Recognition

This work presents the application of weighted majority voting technique for combination of classification decision obtained from three Multi_Layer Perceptron(MLP) based classifiers for Recognition of Handwritten Devnagari characters using three different feature sets. The features used are intersection, shadow feature and chain code histogram features. Shadow features are computed globally for...

متن کامل

Devnagari Numerals Classification and Recognition Using an Integrated Approach

Character recognition has always been a challenging field for the researchers. There has been an astounding progress in the development of the systems for character recognition. OCR performs the recognition of the text in the scanned document image and converts it into editable form. The OCR process can have several stages like preprocessing, segmentation, recognition and post processing. The r...

متن کامل

Application of Statistical Features in Handwritten Devnagari Character Recognition

In this paper a scheme for offline Handwritten Devnagari Character Recognition is proposed, which uses different feature extraction methodologies and recognition algorithms. The proposed system assumes no constraints in writing style or size. First the character is preprocessed and features namely : Chain code histogram and moment invariant features are extracted and fed to Multilayer Perceptro...

متن کامل

Indus Image Segmentation Using Watershed and Histogram Projections

Character segmentation is the major step of document image analysis and optical character recognition (OCR). The character segmentation is necessary to detect all the character regions in the image document. The proposed method preprocesses the image document with edge detection techniques to enhance the character edges. Further, the watershed algorithm is implemented to identify the regions of...

متن کامل

Recognition of Off-Line Handwritten Devnagari Characters Using Quadratic Classifier

Recognition of handwritten characters is a challenging task because of the variability involved in the writing styles of different individuals. In this paper we propose a quadratic classifier based scheme for the recognition of offline Devnagari handwritten characters. The features used in the classifier are obtained from the directional chain code information of the contour points of the chara...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:
  • CoRR

دوره abs/1109.1247  شماره 

صفحات  -

تاریخ انتشار 2011